Balancing Exploration and Exploitation in Classical Planning

Authors

Tim Schulte

Thomas Keller

Abstract

Successful heuristic search planners for satisficing planning, such as FF or LAMA, are usually based on one or more best-first search techniques. Recent research has led to planners like Arvand, Roamer, or Probe, in which novel techniques like Monte-Carlo Random Walks extend the traditional exploitation-focused best-first search with an exploration component. The UCT algorithm balances these contradictory incentives and has shown tremendous success in related areas of sequential decision making, but it has not yet been applied to classical planning. We make up for this shortcoming by applying the Trial-based Heuristic Tree Search (THTS) framework to classical planning. We show how to model the best-first search techniques Weighted A* and Greedy Best-First Search with only three ingredients: action selection, initialization, and backup function. Then we use THTS to derive four versions of the UCT algorithm that differ in the backup functions they use. The experimental evaluation shows that our main algorithm, GreedyUCT, outperforms all other algorithms presented in this paper, both in terms of coverage and quality.
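
To make the exploration/exploitation trade-off in the abstract concrete, here is a minimal Python sketch of a UCB1-style action selection together with two interchangeable backup functions. It is an illustration under stated assumptions, not the authors' implementation: the `Node` class, its field names, the exploration constant `c`, and the two backups (a visit-weighted average and a minimum) are all hypothetical, and the abstract does not say which backup functions the paper uses. Action costs and the normalization of estimates are omitted for brevity.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Search-tree node; all field names are hypothetical."""
    estimate: float                      # cost-to-go estimate, lower is better
    visits: int = 1                      # how often this node was visited
    children: List["Node"] = field(default_factory=list)


def ucb1_select(parent: Node, c: float = math.sqrt(2)) -> Node:
    """Action selection: pick the child with the lowest estimate after
    subtracting a UCB1 exploration bonus (costs are minimized here)."""
    def score(child: Node) -> float:
        bonus = c * math.sqrt(math.log(parent.visits) / child.visits)
        return child.estimate - bonus
    return min(parent.children, key=score)


def backup_average(node: Node) -> float:
    """Backup variant 1 (assumed): visit-weighted average of child estimates."""
    total = sum(ch.visits for ch in node.children)
    return sum(ch.visits * ch.estimate for ch in node.children) / total


def backup_minimum(node: Node) -> float:
    """Backup variant 2 (assumed): estimate of the best child only."""
    return min(ch.estimate for ch in node.children)


# A well-explored child that looks good and a barely explored one that
# looks slightly worse.
a = Node(estimate=5.0, visits=8)
b = Node(estimate=6.0, visits=1)
root = Node(estimate=0.0, visits=10, children=[a, b])

greedy_choice = ucb1_select(root, c=0.0)   # pure exploitation
uct_choice = ucb1_select(root)             # exploration bonus included
root.estimate = backup_minimum(root)
```

With `c=0.0` the selection is purely greedy and returns the child with the lower estimate; with the default constant, the rarely visited child receives a large enough bonus to be tried instead. Swapping `backup_minimum` for `backup_average` changes how information is propagated toward the root without touching the selection rule, which is the kind of modularity the three-ingredient view describes.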

Related resources

Exploration in relational domains for model-based reinforcement learning

A fundamental problem in reinforcement learning is balancing exploration and exploitation. We address this problem in the context of model-based reinforcement learning in large stochastic relational domains by developing relational extensions of the concepts of the E³ and R-MAX algorithms. Efficient exploration in exponentially large state spaces needs to exploit the generalization of the learne...

Model based Bayesian Exploration

Reinforcement learning systems are often concerned with balancing exploration of untested actions against exploitation of actions that are known to be good. The benefit of exploration can be estimated using the classical notion of Value of Information: the expected improvement in future decision quality arising from the information acquired by exploration. Estimating this quantity requires an a...

Best-First Width Search: Exploration and Exploitation in Classical Planning

It has been shown recently that the performance of greedy best-first search (GBFS) for computing plans that are not necessarily optimal can be improved by adding forms of exploration when reaching heuristic plateaus: from random walks to local GBFS searches. In this work, we address this problem but using structural exploration methods resulting from the ideas of width-based search. Width-based...

Balancing Exploration and Exploitation in Alliance Formation

Do firms balance exploration and exploitation in their alliance formation decisions and, if so, why and how? We argue that absorptive capacity and organizational inertia impose conflicting pressures for exploration and exploitation with respect to the value chain function of alliances, the attributes of partners, and partners’ network positions. Although path dependencies reinforce either explo...

Multiobjective Automatic Parameter Calibration of a Hydrological Model

This study proposes variable balancing approaches for the exploration (diversification) and exploitation (intensification) of the non-dominated sorting genetic algorithm-II (NSGA-II) with simulated binary crossover (SBX) and polynomial mutation (PM) in the multiobjective automatic parameter calibration of a lumped hydrological model, the HYMOD model. Two objectives—minimizing the percent bias a...

Publication date: 2014
